Abstract

We describe a new $B$-meson full reconstruction algorithm designed for the Belle
experiment at the $B$-factory KEKB, an asymmetric $e^+e^-$ collider that collected
a data sample of $771.6 \times 10^{6}$ $B\bar{B}$ pairs during its running time. To
maximize the number of reconstructed $B$ decay channels, it utilizes a hierarchical
reconstruction procedure and probabilistic calculus instead of classical selection
cuts. The multivariate analysis package NeuroBayes was used extensively to hold the
balance between highest possible efficiency, robustness and acceptable consumption
of CPU time.

In total, 1104 exclusive decay channels were reconstructed, employing 71
neural networks altogether. Overall, we correctly reconstruct one $B^{+}$ or $B^{0}$
candidate in 0.28% or 0.18% of the $B\bar{B}$ events, respectively. Compared to the
cut-based classical reconstruction algorithm used at the Belle experiment, this is
an improvement in efficiency by roughly a factor of 2, depending on the analysis
considered.

The new framework also features the ability to choose the desired purity or
efficiency of the fully reconstructed sample freely. If the same purity as for the
classical full reconstruction code is desired (about 25%), the efficiency is still
larger by nearly a factor of 2. If, on the other hand, the efficiency is chosen at a
similar level as the classical full reconstruction, the purity rises from about 25%
to nearly 90%.

1. Full B Meson Reconstruction at B Factories

1.1. The Experimental Setup 

One of the biggest advantages of lepton colliders like the KEKB or PEP-II
accelerator compared to hadron accelerators like the Tevatron or the LHC is the
precise knowledge of the initial state and the process of $B$ meson production.
The colliding particles are electrons and positrons. This feature allows for
collisions with well-known energy in the initial state. As the KEKB accelerator [1]
and the Belle detector [2] were designed to study $B$ meson decays, the center
of mass energy of the collisions was chosen as $\sqrt{s} = 10.58$ GeV, which
corresponds to the $\Upsilon(4S)$ resonance. The decay properties of this resonance
are very important for the full reconstruction:

1. The $\Upsilon(4S)$ resonance decays into a $B^+B^-$ or $B^0\bar{B}^0$ pair in
   over 96% of all cases [3], without any additional particles.

2. For the $B^+B^-$ or $B^0\bar{B}^0$ pairs produced in this two-body decay, the
   four-momenta are related by

   $$p(B_1) + p(B_2) = p(e^+) + p(e^-). \tag{1}$$

3. The two $B$ mesons are almost at rest in the center of mass frame of the
   $\Upsilon(4S)$,

   $$p^{*}_{B} = 380~\mathrm{MeV}/c, \tag{2}$$

   compared to the lighter mesons and therefore produce a spherical event
   topology.

There are, however, events where no $\Upsilon(4S)$, but pairs of light quarks
($u\bar{u}$, $d\bar{d}$, $s\bar{s}$, or $c\bar{c}$) are produced. These events form
a continuum background to $B$ meson pair production and ideally are rejected by the
analysis.

The full reconstruction described in this paper was developed for the Belle
detector [2], a large-solid-angle magnetic spectrometer located at the KEKB
collider [1]. It consists of a silicon vertex detector (SVD), a 50-layer central
drift chamber (CDC), an array of aerogel threshold Cherenkov counters (ACC),
a time-of-flight scintillation counter (TOF) and an electromagnetic calorimeter
composed of CsI(Tl) crystals (ECL). All these detectors are surrounded by a
superconducting solenoid, providing a 1.5 T magnetic field, and an iron flux
return which is instrumented to detect $K_L^0$ mesons and to identify muons (KLM).

1.2. The Full Reconstruction 

The main goal and also the main difficulty of the full reconstruction is to
take any event and try to reconstruct one of the $B$ mesons in one of many
different decay channels. Should this attempt succeed, it is possible to assign
all the tracks and electromagnetic clusters used in the reconstruction to this
one $B$ meson. As it is completely reconstructed, its 4-momentum is known.
We call the fully reconstructed $B$ meson the $B_{\mathrm{tag}}$. After
reconstruction of the tag side, it is possible to assign all the remaining tracks
and electromagnetic clusters within the detector to the other $B$ meson, which we
call the $B_{\mathrm{sig}}$ (see figure 1). This $B_{\mathrm{sig}}$ meson actually
is the object of interest for physics analyses, as explained below.

Figure 1: Exemplary fully reconstructed event. The $B_{\mathrm{sig}}$ (signal side)
is the decay of physics interest, while the $B_{\mathrm{tag}}$ (tag side) is the
other $B$ meson, reconstructed by the full reconstruction method.

We can be sure that there are no additional particles produced by the $e^+e^-$
collision within the detector, as the $\Upsilon(4S)$ resonance decays into two $B$
mesons only. In this two-body decay, we can obtain the momentum of the
$B_{\mathrm{sig}}$ without any additional analysis once the $B_{\mathrm{tag}}$ is
known. This follows by applying 4-momentum conservation as given by equation (1).
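As a minimal illustration of equation (1), assume that four-momenta are stored as
plain (E, px, py, pz) tuples in GeV and evaluated in the center of mass frame. The
function name and the numbers below are hypothetical and not part of the Belle
software; the signal-side four-momentum then follows by simple subtraction:

```python
def signal_side_four_momentum(p_beams, p_tag):
    """Return p(B_sig) = p(e+) + p(e-) - p(B_tag), component by component."""
    return tuple(b - t for b, t in zip(p_beams, p_tag))


# Illustrative numbers only: in the center of mass frame the two beams carry
# sqrt(s) = 10.58 GeV in total and no net momentum.
p_beams = (10.58, 0.0, 0.0, 0.0)
p_tag = (5.29, 0.10, -0.20, 0.25)

p_sig = signal_side_four_momentum(p_beams, p_tag)
print(p_sig)
```

In this frame the signal-side momentum is simply opposite to the tag-side momentum.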

This procedure might seem rather involved at first glance, but has the benefit
that it yields otherwise inaccessible information about a $B$ decay on the signal
side that is hard or impossible to reconstruct. A prominent example for the
application of the full reconstruction is a $B$ meson decay including neutrinos,
where the decay kinematics can otherwise not be fully constrained, or a decay with
very large non-$B\bar{B}$ background. Many of these decays are very sensitive to
small contributions from new physics and thus it is important to adopt powerful
reconstruction algorithms for them. Examples for the application of the full
reconstruction include:



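$$B^{+} \to \tau^{+}\nu_{\tau} \tag{3}$$

$$B^{+} \to \bar{D}^{(*)0}\tau^{+}\nu_{\tau} \tag{4}$$

$$B^{+} \to K^{+}\nu\bar{\nu} \tag{5}$$

$$B^{0} \to \nu\bar{\nu} \tag{6}$$

$$B \to X_{u}\ell\nu \tag{7}$$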



One possible topology of the first decay is given in figure 1, where the $\tau$
lepton decays into an electron and two neutrinos.

The most important practical difference between the full reconstruction 
method and most analyses is just the sheer number of decay channels for the 
tag side. As there are several hundreds of known B decay channels, the task of 
reconstructing one of the two B mesons in the event cannot always succeed. Ad- 
ditionally, most of those decay channels include other unstable particles, mostly 

and D mesons, which also decay in a vast spectrum of decay channels that 
also have to be reconstructed. 
The quantity that has to be maximized by the full reconstruction method is
the total $B$ reconstruction efficiency

$$\varepsilon_{\mathrm{tot}} = \sum_{i}^{N} \varepsilon_i \mathcal{B}_i, \tag{8}$$

where $N$ is the number of reconstructed $B$ decay channels, $\varepsilon_i$ is the
reconstruction efficiency of the decay channel $i$ and $\mathcal{B}_i$ is the
branching fraction of the decay channel $i$. The typical scale for $\mathcal{B}_i$
is 10~^ to 10~^ and typically $\varepsilon_i$ is of the order of 10%. As the
$\mathcal{B}_i$ is fixed by nature, we can maximize $\varepsilon_{\mathrm{tot}}$
only by increasing $\varepsilon_i$ and the number of reconstructed decay channels
$N$. In order to increase $\varepsilon_i$, multivariate techniques are used (see
section 2). The main challenge is to keep track of all the variables used in these
multivariate methods, particularly because we want to reconstruct as many decay
channels as possible. For this we had to develop a software framework which allows
us to automatically manage hundreds of decay channels with extensive usage of
multivariate methods. The automatic handling of many steps minimizes human errors.
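Equation (8) can be illustrated with a short sketch. The (efficiency, branching
fraction) pairs below are invented for illustration and do not correspond to any
channel of the actual reconstruction:

```python
def total_efficiency(channels):
    """Sum eps_i * B_i over all reconstructed decay channels (equation 8)."""
    return sum(eff * bf for eff, bf in channels)


# Hypothetical (efficiency, branching fraction) pairs, for illustration only.
channels = [(0.10, 1.0e-3), (0.05, 2.0e-3), (0.20, 5.0e-4)]
eps_tot = total_efficiency(channels)

# Adding a further channel can only increase the total efficiency.
eps_tot_more = total_efficiency(channels + [(0.08, 1.0e-3)])
print(f"{eps_tot:.2e} {eps_tot_more:.2e}")
```

Since the branching fractions are fixed, the sum grows only through larger
per-channel efficiencies or through additional channels, which is the strategy
followed here.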



2. Multivariate Techniques 

A common technique to achieve more sophisticated selections is to combine 
all significant variables available into a single scalar variable, for example a 
likelihood ratio, and to perform a cut on this new variable. These multivariate 
techniques are in principle capable of taking correlations of the variables into 
account. The application of these techniques can, however, be rather involved. 
Simplified models can deliver quite good results when correlations between the 
different variables are small. 

Another example of a multivariate technique is the NeuroBayes package [4]
that was used extensively for the new full reconstruction tool. The idea of
the NeuroBayes package is to pass all of the relevant variables through a
preprocessing algorithm to a neural network. For a classification task, i.e. to
decide if a candidate is signal or background, the network maps the input variables
to a single output variable while taking into account the correlations of the input
variables. An example of the separation power of this output variable for one
of the classification tasks used can be seen in figure 2(a).



2.1. NeuroBayes Output as a Probability 

As shown in figure 2(b), the purity, defined as the number of signal events
divided by the total number of events in a network output bin, is a linear
function of the NeuroBayes output. This indicates that the produced output is
a good measure of probability for the candidate to be signal.
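The purity check described above can be sketched as follows. This is a minimal
illustration, assuming the network output has already been mapped to the interval
[0, 1] and that truth labels are available from simulation; the function name and
the toy sample are hypothetical:

```python
def purity_per_bin(outputs, is_signal, n_bins=10):
    """Purity (signal / all candidates) in equal-width bins of the output."""
    n_sig = [0] * n_bins
    n_all = [0] * n_bins
    for out, sig in zip(outputs, is_signal):
        b = min(int(out * n_bins), n_bins - 1)  # output == 1.0 goes to last bin
        n_all[b] += 1
        if sig:
            n_sig[b] += 1
    # Empty bins have no defined purity.
    return [s / a if a else None for s, a in zip(n_sig, n_all)]


# Toy sample built so that the signal fraction in each bin equals the bin centre,
# which is what a well-calibrated output would show.
outputs, labels = [], []
for k in range(10):
    centre = (k + 0.5) / 10
    for j in range(20):
        outputs.append(centre)
        labels.append(j < 2 * k + 1)

purities = purity_per_bin(outputs, labels)
```

For a calibrated output the purity in each bin agrees with the bin centre, up to
statistical fluctuations on a real sample.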

Figure 2: (a) The distribution of the NeuroBayes output for signal (red) and
background (black) for an exemplary classification task of pion candidates. (b) The
purity, obtained from the network output distributions shown in (a), is a linear
function of the NeuroBayes output.

If a NeuroBayes training is performed with the same signal to background
ratio as found on data, the output of the classification can directly be
interpreted as a Bayesian probability for signal. While it would be better to train
the neural network with the same signal to background ratio as expected on
data, it is sometimes not possible. If, for example, the desired signal is very
rare in nature, a training would not learn to distinguish the few signal events
from the millions of background events, but rather try to learn something from
statistical fluctuations of the background that swamp the signal and therefore
also dominate the loss function that is minimized during the network training.
Therefore, a training with a higher signal fraction is the only way in which
the selection of such rare signals can be optimized. On the other hand, if we
artificially increase the signal to background ratio, the network output cannot
be interpreted as a Bayesian probability any more on the real dataset, because
the a priori probabilities of being signal or background differ from the
training dataset. Nevertheless, one can correct the network output in a way that
is interpretable as a probability again. For this, we need to know the signal to
background ratio in the training dataset and in the dataset where the network
should give the prediction. To calculate this correction, we need Bayes' theorem,
which is defined for two types of events, $X$ and $Y$, as

$$P(X|Y) = \frac{P(Y|X)\,P(X)}{P(Y)}. \tag{9}$$



For our purposes, it is preferable to use Bayes' theorem in terms of the likelihood
ratio

$$\Lambda(Y|X) = \frac{P(Y|X)}{P(Y|\bar{X})}, \tag{10}$$

which leads to prior odds of

$$O(X) = \frac{P(X)}{P(\bar{X})}, \tag{11}$$

and posterior odds given by

$$O(X|Y) = O(X) \cdot \Lambda(Y|X). \tag{12}$$

In our example $X$ and $\bar{X}$ are signal events ($S$) and background events
($B$) and $Y$ is the output ($o_t$) from a network trained with the training
dataset (denoted with the subscript $t$). The likelihood ratio is

$$\Lambda(o_t|S) = \frac{P(o_t|S)}{P(o_t|B)}, \tag{13}$$

where $P(o_t|S)$ is the likelihood to get a network output $o_t$ given a signal
event $S$ and $P(o_t|B)$ is the same for a background event. Given a network output
$o_t$, the conditional probability of being a signal event is

$$o_t = P_t(S|o_t), \tag{14}$$

and the corresponding probability of being a background event $B$ is given by

$$(1 - o_t) = P_t(B|o_t). \tag{15}$$

By applying Bayes' theorem as follows

$$\frac{P_t(S|o_t)}{P_t(B|o_t)} = \Lambda(o_t|S)\,\frac{P_t(S)}{P_t(B)}, \tag{16}$$

we can write the likelihood ratio as

$$\Lambda(o_t|S) = \frac{P(o_t|S)}{P(o_t|B)} = \frac{o_t}{1 - o_t}\,\frac{P_t(B)}{P_t(S)}. \tag{17}$$

This likelihood ratio does not depend on the signal to background ratio because
it only contains measured information of one given event. We can now calculate,
for any other signal to background ratio in the prediction dataset (denoted with
the subscript $p$), the posterior odds with Bayes' theorem:

$$\frac{P_p(S|o_t)}{P_p(B|o_t)} = \Lambda(o_t|S)\,\frac{P_p(S)}{P_p(B)}. \tag{18}$$

Because the transformed probability $o_p$ has to satisfy

$$\frac{P_p(S|o_p)}{P_p(B|o_p)} = \frac{o_p}{1 - o_p} \tag{19}$$

to be the correct probability, we get:

$$o_p = \frac{1}{1 + \left(\frac{1}{o_t} - 1\right)\frac{P_p(B)}{P_p(S)}\,\frac{P_t(S)}{P_t(B)}}. \tag{20}$$

Equation (20) is used in the full reconstruction algorithm described in the next
section to calculate the signal probability for modes whose purity is so low that
the signal fraction had to be increased for the network training.

stage   particles
1       tracks, $K_S^0$, $\gamma$, $\pi^0$
2       $D^{\pm}_{(s)}$, $D^0$ and $J/\psi$ mesons
3       $D^{*\pm}_{(s)}$ and $D^{*0}$ mesons
4       $B^{+}$ and $B^{0}$ mesons

Table 1: The 4 stages of the hierarchical system
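The prior correction of the network output, i.e. the step from the training-sample
output $o_t$ to the probability $o_p$ valid for the prediction sample, can be
sketched in a few lines. The function and parameter names are hypothetical; the
signal fractions stand for $P_t(S)$ and $P_p(S)$, with $P(B) = 1 - P(S)$:

```python
def correct_output(o_t, sig_frac_train, sig_frac_pred):
    """Convert the training-sample probability o_t into o_p for other priors.

    sig_frac_train = P_t(S), sig_frac_pred = P_p(S); background is 1 - P(S).
    """
    if o_t <= 0.0:
        return 0.0
    if o_t >= 1.0:
        return 1.0
    # P_p(B)/P_p(S) * P_t(S)/P_t(B)
    prior_ratio = ((1.0 - sig_frac_pred) / sig_frac_pred) * (
        sig_frac_train / (1.0 - sig_frac_train)
    )
    return 1.0 / (1.0 + (1.0 / o_t - 1.0) * prior_ratio)


# Trained with 50% signal, applied to a sample with 1% signal: an output of 0.5
# carries no information beyond the prior, so it maps back to the prior of 1%.
o_p = correct_output(0.5, sig_frac_train=0.5, sig_frac_pred=0.01)
```

If the training and prediction samples have the same signal fraction, the
correction leaves the output unchanged.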



3. Selection and Reconstruction 

In order to reconstruct as many $B$ meson decays as possible, it is not feasible
to take care of the thousands of exclusive decay channels individually. Instead,
a hierarchical approach was chosen. We divide the reconstruction into 4 stages,
as shown in table 1 and illustrated in figure 3.

Figure 3: The 4 stages of the full reconstruction

One aim of the full reconstruction is to achieve high efficiency. This could in
theory be done by always reconstructing every possible candidate at all stages
in an event and then finally taking the best $B$ meson candidate. In practice,
however, the computing power needed to pursue this maximum efficiency strategy
is not available and it is necessary to perform cuts during the selection and
reconstruction process. A main principle of this ansatz is to calculate the
signal probabilities at each stage of the hierarchical system, while cuts on these
probabilities occur only at a later stage.

3.1. Data Samples 

For the training we used a Monte Carlo generated data sample with a full
detector simulation based on GEANT [5]. It includes $e^+e^-$ annihilation to
non-$b$ quarks ($u$, $d$, $s$, $c$) generated with PYTHIA [6] and to the
$\Upsilon(4S)$ resonance generated with the EvtGen package [7]. The produced $B$
mesons then decay inclusively to any possible final state governed by the
$b \to c$ transition.

3.2. The First Stage 

In the first stage, NeuroBayes networks are trained on Monte Carlo samples
for charged, long-lived particle type hypotheses (kaon, pion, electron, muon) for
the measured charged tracks and for the photon hypothesis for each cluster in
the electromagnetic calorimeter not matched geometrically to a charged track.
Neutral pion candidates are formed out of two electromagnetic clusters whose
invariant mass lies within the window
$115~\mathrm{MeV}/c^2 < M(\pi^0) < 153~\mathrm{MeV}/c^2$.
Additionally, the energy of the photons that are used to construct the $\pi^0$
candidate has to lie above 30 MeV. Candidate $K_S^0$ particles are formed from two
charged tracks whose invariant mass lies within 30 MeV/$c^2$ of the nominal $K_S^0$
mass. Only very loose preselection criteria on the impact parameter of all tracks
and the particle identification variable for the candidates are applied. As an
example, in a decay to $K^{-}\pi^{+}$, a signal efficiency of 96% and a background
reduction factor of 3.5 could be observed. After these pre-cuts, NeuroBayes
networks are trained for all particle hypotheses. As an input for the trainings
of the charged particles, measurements of the time-of-flight, the energy loss in
the CDC and Cherenkov light in the ACC are used. For the photon hypothesis,
several variables to describe the shower shape in the calorimeter are used.
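The neutral pion preselection quoted above (two clusters, photon energy above
30 MeV, invariant mass between 115 and 153 MeV/$c^2$) can be sketched as below.
This is only an illustration under the assumption that photons are given as
(E, px, py, pz) tuples in GeV; the names are hypothetical and this is not the Belle
implementation:

```python
import math
from itertools import combinations

PI0_MASS_WINDOW = (0.115, 0.153)  # GeV/c^2, window quoted in the text
MIN_PHOTON_ENERGY = 0.030         # GeV, photon energy threshold from the text


def invariant_mass(four_momenta):
    """Invariant mass of a set of (E, px, py, pz) four-momenta in GeV."""
    e, px, py, pz = (sum(p[k] for p in four_momenta) for k in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def pi0_candidates(photons):
    """Return all photon pairs passing the energy and mass-window requirements."""
    good = [g for g in photons if g[0] > MIN_PHOTON_ENERGY]
    low, high = PI0_MASS_WINDOW
    return [
        (g1, g2)
        for g1, g2 in combinations(good, 2)
        if low < invariant_mass((g1, g2)) < high
    ]


# Toy event: two back-to-back photons from a pi0 at rest, one photon that is
# too soft, and one unrelated hard photon.
photons = [
    (0.0675, 0.0675, 0.0, 0.0),
    (0.0675, -0.0675, 0.0, 0.0),
    (0.020, 0.0, 0.020, 0.0),
    (0.500, 0.0, 0.0, 0.500),
]
candidates = pi0_candidates(photons)
```

Every surviving pair is kept as a candidate; the decision between overlapping
candidates is left to the later stages.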

3.3. The Second Stage 

In the second stage, combinations of two to five candidates from the first
stage were used to reconstruct $D^0$, $D^{\pm}$, $D_s^{\pm}$ and $J/\psi$ mesons. A
list of the decay channels used for the reconstruction of these mesons and their
respective branching fractions can be found in table 2.

As these trainings were performed on inclusive simulated samples, a large
fraction of the true $D$ mesons did not come from $B$ meson decays. Since only
the $D$ mesons from $B$ decays are of interest, a cut on the momentum $p^{*}$ in
the $\Upsilon(4S)$ rest frame was performed:

$$p^{*} < 2.6~\mathrm{GeV}/c. \tag{21}$$

This cut excludes the majority of $D$ mesons not stemming from $B$ decays, i.e.
from $c\bar{c}$ fragmentation.

Table 2: Stage 2 - Reconstructed $D$ and $J/\psi$ modes. Branching ratios are from
Ref. [3].

3.3.1. Selection Criteria

In order to retain reasonable computing time, soft pre-cuts are applied at
this stage. We define the quantity

$$\mathrm{NB}_{\mathrm{out,prod}} = \prod_{i}^{N} \mathrm{NB}_{\mathrm{out},i}, \tag{22}$$

where $N$ is the number of daughters in a given decay and
$\mathrm{NB}_{\mathrm{out},i}$ is the neural network output of the $i$-th daughter,
which we use to suppress obvious background.
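Equation (22) amounts to a simple product over the daughter outputs. A minimal
sketch, with hypothetical function names and an arbitrary illustrative cut value:

```python
import math


def nb_out_prod(daughter_outputs):
    """Product of the network outputs of all daughters (equation 22)."""
    return math.prod(daughter_outputs)


def passes_precut(daughter_outputs, cut):
    """Soft pre-cut: keep the combination if the product exceeds the cut."""
    return nb_out_prod(daughter_outputs) > cut


# Three daughters with good outputs versus a combination with one poor daughter.
good = [0.9, 0.8, 0.5]
poor = [0.9, 0.8, 0.001]
```

A single daughter with a very low signal probability is enough to push the
product below a soft cut, which is how obvious background is removed cheaply.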

The cuts for all decay modes of a particle type were determined simultaneously
to optimally use the CPU resources. To explain the determination of the
cuts, let us focus on a single $D$ meson type: the cuts were determined for all of
its modes simultaneously. It was required that the additional amount of background
that would have to be taken into the sample to gain one additional signal event was
the same for all of these modes. This means that very clean channels will get a
very soft cut and at the same time, more complicated channels will get slightly
harder cuts so that the consumed computing power is minimized. To determine
these cuts, for each mode the number of signal events in the sample was
plotted against the number of background events for the different possible cuts
on the product of the NeuroBayes outputs of the children. If we now look at the
slopes of the tangents of these different plots, the same tangent slope indicates
the same additional number of background events for one additional signal event.
The cut is set at that value where this condition is met. Figure 4 shows a possible
choice for the slope.
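The equal-slope criterion can be sketched with a finite-difference version of the
tangent condition. The sketch assumes that each mode provides a scan of
(cut, number of signal, number of background) points ordered from the hardest to
the softest cut; the function name, the scan points and the threshold are all
invented for illustration:

```python
def choose_cut(scan, max_bkg_per_sig):
    """Loosen the cut while the extra background per extra signal stays small.

    scan: list of (cut, n_sig, n_bkg), ordered from hardest to softest cut.
    max_bkg_per_sig: common limit on additional background per additional signal.
    """
    chosen = scan[0][0]
    for (_, s0, b0), (cut1, s1, b1) in zip(scan, scan[1:]):
        d_sig = s1 - s0
        d_bkg = b1 - b0
        if d_sig <= 0 or d_bkg / d_sig > max_bkg_per_sig:
            break
        chosen = cut1
    return chosen


# Hypothetical scans for a clean and a background-dominated mode.
clean_mode = [(0.9, 100, 10), (0.5, 150, 30), (0.1, 180, 90), (0.01, 190, 400)]
dirty_mode = [(0.9, 50, 100), (0.5, 70, 180), (0.1, 80, 500), (0.01, 82, 2000)]

cut_clean = choose_cut(clean_mode, max_bkg_per_sig=5.0)
cut_dirty = choose_cut(dirty_mode, max_bkg_per_sig=5.0)
```

With the same limit applied to both modes, the clean mode ends up with the softer
cut, as described above.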

The steeper this slope is, the higher the efficiency is, but also the higher
the computational effort needed in these modes. The final choice of the exact
value of each slope is to some extent arbitrary. Our decision for the slopes
of $D^0$, $D^{\pm}$, $D_s^{\pm}$ and also those of the $D^*$ modes in stage 3 was
made from the point of view of combining these particles to a $B$ meson and then
getting on average much less than one candidate per event. All of the remaining
candidates for $D^0$, $D^{\pm}$, $D_s^{\pm}$ and $J/\psi$ mesons were again
classified using NeuroBayes. The networks comprise a large number of variables. The
variables with the largest separation power are the product of the NeuroBayes
outputs of the children, the invariant mass of children pairs and the angle between
them, the angle between the momentum of the $D$ meson and the line connecting the
$D$ decay vertex to the interaction point, and the significance of the distance of
the $D$ meson decay vertex to the interaction point.

Special attention was paid to not include any mass-dependent variable in
these trainings, so that, if necessary, a check of our $D$ and $J/\psi$ meson
sample could be performed by looking at the unbiased mass distributions. Examples
of intermediate results for a few stage 2 channels are shown in figures 5 and 6.

In the hierarchical system, it is very important that all the outputs of the
NeuroBayes trainings actually represent their signal probabilities, so that the
cuts performed, and later on the ranking of candidates from different decay
channels, are meaningful and correct. Had we used the original signal to
background ratio during the trainings, this would have been automatically correct.
Because the background level was too high, this was not possible for some channel
trainings, so we artificially increased the signal component to reach at least
10%. This resulted in the need to recalculate the NeuroBayes output after the
classification to account for the artificially enhanced signal component during
training, as explained in section 2.1.

Figure 4: The signal-background plots for the cut determination. The black dots are
our cutting points and all the lines have the same slope in these points. (For
colored lines, see the online version of this paper.)

Table 3: Stage 3 - all $D^*$ modes (BR from [3])

3.4. The Third and Fourth Stage

The same procedure of preselection, training and recalculating was then
repeated for $D^{*\pm}_{(s)}$ and $D^{*0}$ mesons in stage 3 (for the channels and
their branching ratios, see table 3) and finally for $B^{+}$ and $B^{0}$ mesons in
stage 4. Variables with good discrimination power were again the product of the
NeuroBayes outputs of the children, the mass of the $D$ meson, the mass difference
of the $D$ and $D^*$ meson and, for $B$ meson decays, the energy difference
$\Delta E$, the angle between the $B$ meson and the thrust axis and angles between
pairs of children. A list of all $B$ meson decay modes used and the corresponding
branching ratios can be seen in table 4. The results of the $B^{+}$ and $B^{0}$
meson trainings were used to rank the candidates in each event according to their
NeuroBayes outputs. The best candidate selection is now simply a matter of choosing
only the first rank.
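The ranking step can be sketched as follows, assuming a flat list of
(event id, candidate id, network output) records; the record layout and the names
are hypothetical:

```python
def best_candidates(candidates):
    """Keep, for each event, the candidate with the highest network output.

    candidates: iterable of (event_id, candidate_id, nb_output).
    Returns a dict mapping event_id to the candidate_id of rank one.
    """
    best = {}
    for event_id, cand_id, output in candidates:
        if event_id not in best or output > best[event_id][1]:
            best[event_id] = (cand_id, output)
    return {event_id: cand_id for event_id, (cand_id, _) in best.items()}


# Toy input: two events with several B candidates each.
candidates = [
    (1, "a", 0.12),
    (1, "b", 0.80),
    (1, "c", 0.45),
    (2, "d", 0.05),
    (2, "e", 0.02),
]
selected = best_candidates(candidates)
```

Because the outputs are calibrated signal probabilities, candidates from different
decay channels can be compared directly in this ranking.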

3.5. Suppression of non-$B\bar{B}$ Background

Non-$B\bar{B}$ events differ from $B\bar{B}$ events in the event shape. As there is
hardly any kinetic energy left in $B\bar{B}$ events, the decay particles are much
more spherically distributed, in contrast to the jet-like structure of
non-$B\bar{B}$ events. There are numerous variables to quantify the different event
shapes. The reduced second Fox-Wolfram moment $R_2$ [8] gives
non-candidate-specific information about the event shape, while the thrust angle
and $\cos\theta_B$ provide information for each individual candidate. The Super
Fox-Wolfram Moments (SFWM) [9] contain additional information about the tag and
signal side.
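To illustrate how such an event shape variable separates the two topologies, the
sketch below computes $R_2 = H_2/H_0$ from the standard definition of the
Fox-Wolfram moments, $H_l \propto \sum_{i,j} |p_i||p_j| P_l(\cos\theta_{ij})$. The
function name is hypothetical, and which particles enter the sums in the actual
analysis is not specified here:

```python
import math


def fox_wolfram_r2(momenta):
    """Reduced second Fox-Wolfram moment R2 = H2 / H0.

    momenta: list of (px, py, pz) three-momenta of the particles in the event.
    The common normalization of H0 and H2 cancels in the ratio.
    """
    h0 = 0.0
    h2 = 0.0
    for a in momenta:
        for b in momenta:
            pa = math.sqrt(sum(x * x for x in a))
            pb = math.sqrt(sum(x * x for x in b))
            if pa == 0.0 or pb == 0.0:
                continue
            cos_t = sum(x * y for x, y in zip(a, b)) / (pa * pb)
            h0 += pa * pb
            h2 += pa * pb * 0.5 * (3.0 * cos_t * cos_t - 1.0)  # Legendre P2
    return h2 / h0


# Idealized jet-like event: two back-to-back particles.
jet_like = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
# Idealized spherical event: particles along all six axis directions.
spherical = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

r2_jet = fox_wolfram_r2(jet_like)
r2_sph = fox_wolfram_r2(spherical)
```

The two idealized configurations mark the extremes: values near one for jet-like
events and values near zero for spherical events.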

In the default mode of the full reconstruction, no event shape variables are
used, as this might introduce some bias for certain analyses. There is, however,
an additional algorithm that can be used after the full reconstruction. This
algorithm recalculates the NeuroBayes output of all of the $B$ candidates and
determines the best candidate again, based on the new output. The algorithm
incorporates two continuum suppression networks. The first network uses the
reduced second Fox-Wolfram moment, the thrust angle and $\cos\theta_B$. It
therefore only depends on the $B_{\mathrm{tag}}$. The second network additionally
contains the Super Fox-Wolfram Moments, depending also on the $B_{\mathrm{sig}}$.
As these networks take more information into account, there is a significant
improvement in the quality of the NeuroBayes output and also in the best candidate
selection. The results can be found in figures 7 and 8.

Table 4: Stage 4 - All $B$ modes (BR from [3])

4. Performance of the new Algorithm 

4.1. Efficiency Estimation 

There is an existing full reconstruction algorithm at Belle (see e.g. [10-14])
using a classical, cut-based reconstruction method without taking probabilistic
information into account. We compare the performance of the new and the
classical algorithm by estimating the numbers of correctly reconstructed
$B_{\mathrm{tag}}$ candidates using the final Belle data sample collected at the
$\Upsilon(4S)$ resonance. The sample contains $771.6 \times 10^{6}$ $B\bar{B}$
pairs. The kinematic consistency of a $B_{\mathrm{tag}}$ candidate with a $B$ meson
decay is checked using the beam-energy constrained mass
$M_{bc} = \sqrt{E_{\mathrm{beam}}^{2} - p_{B}^{2}}$, where $E_{\mathrm{beam}}$ is
the measured beam energy and $p_B$ is the reconstructed momentum of the $B$ meson
in the center-of-mass rest frame. None of the variables used in the network
trainings are correlated with $M_{bc}$, which can therefore be used to estimate the
number of correctly reconstructed $B_{\mathrm{tag}}$ candidates from fits to the
$M_{bc}$ distribution.
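The beam-energy constrained mass can be computed as in the following sketch, in
natural units (GeV) and with a hypothetical function name. The beam energy of
5.29 GeV is half of the center of mass energy quoted earlier; the candidate
momentum is an invented example:

```python
import math


def beam_constrained_mass(e_beam, p_b):
    """M_bc = sqrt(E_beam^2 - |p_B|^2), all quantities in the CM frame (GeV)."""
    p_squared = sum(x * x for x in p_b)
    m_squared = e_beam * e_beam - p_squared
    if m_squared < 0.0:
        raise ValueError("candidate momentum exceeds the beam energy")
    return math.sqrt(m_squared)


# Illustrative candidate with a momentum of 0.33 GeV/c along z.
m_bc = beam_constrained_mass(5.29, (0.0, 0.0, 0.33))
print(f"{m_bc:.4f}")
```

Only the candidate momentum enters, not its measured energy, so the resolution of
$M_{bc}$ is governed by the beam energy spread and the momentum measurement.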

Since this paper focuses on the description of the new full reconstruction
method and its improvements, we do not evaluate systematic uncertainties on
the fitted signal yields as would be required for physics analyses. Typically any
full reconstruction tool is used in conjunction with a signal side analysis. For
most signal side analyses, only the largest possible efficiency of the tag side
sample is important, as the background is reduced dramatically by the signal
side selection. When we want to compare two full reconstruction methods by
themselves, without any signal side selection, we have to perform a fit to the
inclusive tag side $M_{bc}$ distribution. Especially for the new full
reconstruction, this distribution contains large amounts of background, which are
irrelevant for most analyses, but make the fit results less reliable. Therefore it
is only possible to give a rough estimate for the signal gain and therefore for the
improvement compared to the classical full reconstruction for maximum efficiency.
This is usually not a problem for physics analyses because of the applied selection
criteria.

The number of correctly reconstructed $B$ mesons is estimated from the fit
to be 2.1 million $B^{+}$ and 1.4 million $B^{0}$ for the maximum efficiency case.
This corresponds to an efficiency of roughly 0.18% for $B^{0}$ and 0.28% for
$B^{+}$. This efficiency is defined as the number of correctly reconstructed $B$
mesons divided by the number of produced $B\bar{B}$ pairs, which is the same as the
number of produced charged and neutral $B$ mesons, respectively. Note that in other
publications a different definition might be used, which takes the number of
produced charged or neutral $B$ meson pairs as normalization, resulting in twice
the value for the single $B$ meson reconstruction efficiency.

In order to get more reliable fit results, we can introduce cuts on the
NeuroBayes outputs of the $B^{+}$ and $B^{0}$ meson networks, and thereby choose
efficiency and purity freely. Figures 7 and 8 show the resulting purity-efficiency
plots for the three modes explained in section 3.5. Purity is defined as the ratio
of the signal component of the fit to the entire fit result integrated over the
region $M_{bc} > 5.27$ GeV/$c^2$.

If no cut is performed, the standard selection that gives maximum efficiency
is used. One can also choose a cut corresponding to the same purity as in the
classical full reconstruction tool, which results in an increase of efficiency by
approximately a factor of 2, as shown in figure 9(a). A cut corresponding to
the same background level is shown in figure 9(b). One is also free to choose
the same efficiency as in the classical full reconstruction. This results in an
increase in the purity from about 25% to nearly 90%, as shown in figures 9(c)
and 9(d). Any working point between and even beyond these three examples
can be chosen in a very simple manner (cutting on the output of the stage 4
networks) by the user.
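Choosing a working point by cutting on the network output can be sketched as
below. Note the assumptions: truth labels from simulation replace the $M_{bc}$
fits used in this paper, the efficiency is taken relative to the signal candidates
in the sample rather than to the number of $B\bar{B}$ pairs, and all names and
numbers are hypothetical:

```python
def working_point(candidates, target_purity):
    """Loosest cut on the network output that still reaches the target purity.

    candidates: list of (nb_output, is_signal) pairs.
    Returns (cut, efficiency, purity) for the selection output >= cut, or None.
    """
    n_sig_total = sum(1 for _, sig in candidates if sig)
    if n_sig_total == 0:
        return None
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    best = None
    n_sig = n_all = 0
    for i, (output, sig) in enumerate(ranked):
        n_all += 1
        if sig:
            n_sig += 1
        # Candidates with identical outputs pass or fail a cut together.
        if i + 1 < len(ranked) and ranked[i + 1][0] == output:
            continue
        purity = n_sig / n_all
        if purity >= target_purity:
            best = (output, n_sig / n_sig_total, purity)
    return best


toy = [(0.9, True), (0.8, True), (0.7, False),
       (0.6, True), (0.5, False), (0.4, False)]
point = working_point(toy, target_purity=0.75)
```

Scanning the target purity traces out a purity-efficiency curve of the kind shown
in figures 7 and 8.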
Figure 7: Purity-efficiency plot for mesons

Figure 8: Purity-efficiency plot for $B$ mesons

Figure 9: $M_{bc}$ plots for different selections: The dashed blue line is a fit of
the $M_{bc}$ distributions for the new full reconstruction algorithm, the solid red
line to the classical one. The network cuts are chosen to have (a) roughly equal
purity, (b) roughly equal background level, (c), (d) roughly equal efficiency
compared to the classical one.

4.2. Without new Channels

If we exclude the newly added $D$ and $B$ decay channels from the full
reconstruction and choose a network output cut to achieve the same background
level as in the classical full reconstruction, the efficiency is increased by
approximately 50% and 60% for the two $B$ meson types. A comparison of the
individual $B$ decay channels revealed that the largest improvement was achieved
in modes with two or more light mesons, where the new full reconstruction does
not impose any phase-space limits. The newly added channels make a valuable
contribution of approximately 20% of the entire signal sample for both $B^{+}$ and
$B^{0}$ mesons.
4.3. Applied Example: Missing Mass Reconstruction 

In order to test the results of the full reconstruction and also to compare
the performance to its predecessor, a quick benchmark analysis was performed.
This was the search for the decay

$$B^{0} \to D^{*-}\ell^{+}\nu_{\ell} \tag{23}$$

on the signal side. A kinematic variable used to distinguish correctly
reconstructed signal candidates from background candidates is the missing mass,
defined as

$$m_{\mathrm{miss}}^{2} = \left( p_{\Upsilon(4S)} - p_{B_{\mathrm{tag}}} - \sum_{i} p_{i} \right)^{2}, \tag{24}$$

where $p_{\Upsilon(4S)}$ denotes the four-momentum of the $\Upsilon(4S)$ resonance,
$p_{B_{\mathrm{tag}}}$ the four-momentum of the $B_{\mathrm{tag}}$ and
$\sum_i p_i$ is the sum of the four-momenta of the reconstructed particles on the
signal side. Because the neutrino is the only missing particle in this decay, we
expect the missing mass to be zero for signal events. The result can be seen in
figure 10(a) for the new full reconstruction algorithm and, for comparison, in
figure 10(b) for the classical full reconstruction algorithm. A clear peak is
observed at the expected position with similar resolutions for the new and the
classical full reconstruction. Thus, despite the addition of less clean decay
modes, the momentum resolution of the fully reconstructed $B$ meson is preserved.
As expected, we also observe in this applied example a significant increase of
efficiency.
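The missing mass calculation can be sketched as follows, again assuming
(E, px, py, pz) tuples in GeV. The function name is hypothetical and the numbers
are chosen for easy arithmetic rather than realistic kinematics:

```python
def missing_mass_squared(p_upsilon, p_tag, signal_particles):
    """Squared missing mass: (p_Upsilon - p_Btag - sum of signal-side p_i)^2."""
    missing = list(p_upsilon)
    for p in [p_tag, *signal_particles]:
        missing = [m - c for m, c in zip(missing, p)]
    e, px, py, pz = missing
    return e * e - (px * px + py * py + pz * pz)


# Toy configuration in which the unobserved particle is massless.
p_upsilon = (10.0, 0.0, 0.0, 0.0)
p_tag = (5.0, 0.0, 0.0, 3.0)
signal_particles = [(2.0, 0.0, 0.0, -1.0), (1.0, 0.0, 0.0, 0.0)]

m2_miss = missing_mass_squared(p_upsilon, p_tag, signal_particles)
```

With a single massless neutrino as the only unreconstructed particle, the result
vanishes up to resolution effects, which produces the peak at zero.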



5. Conclusion 

We have developed an improved full reconstruction algorithm for the Belle
experiment by introducing a hierarchical selection procedure. Instead of cutting
away candidates in the early stages, we postpone the decision to later stages
by applying only very soft selections on the product of their Bayesian signal
probabilities and giving this probability as an input to the networks of the higher
stages. Together with a higher separation power of the neural networks compared to
a cut-based selection, this enabled us to reconstruct more decay channels with an
acceptable computation time. Depending on the analysis, we expect an overall
improvement of the effective luminosity of roughly a factor of 2 for a large number
of analyses relying on the full reconstruction.

Figure 10: Missing mass distributions for $B^{0} \to D^{*-}\ell^{+}\nu_{\ell}$
decays of the new and classical full reconstruction tool